For this assignment, I have chosen to investigate results data from the International Mathematical Olympiad (IMO) from 1984 to 2017. This easier-to-use data set was downloaded from Kaggle.com, but the full results can also be accessed directly at imo-official.com. The data set includes the names, scores, ranks and awards of each individual who participated in the IMO from 1984 until 2017.
#load libraries
library(dplyr)
library(ggplot2)
library(ggthemes)
library(plotly)
library(htmlwidgets)
library(viridis)
First, I will create static plots (without any dynamic elements) of each year's highest score against the competition year, one for the USA and one for Germany.
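One detail of the per-year selection in the code that follows is worth keeping in mind: `slice(which.max(total))` keeps only the first row that reaches the maximum, so a year in which several contestants tie for the top score still contributes a single point. The sketch below illustrates this on made-up data; the `toy` data frame and its values are invented for illustration and are not taken from the IMO results. `slice_max()` is shown only as a possible alternative if ties should be kept.

```r
library(dplyr)

# Made-up scores: two contestants tie for the top score in 2000
toy <- data.frame(
  year     = c(2000, 2000, 2000, 2001, 2001),
  lastname = c("A", "B", "C", "D", "E"),
  total    = c(42, 42, 30, 35, 28)
)

# Approach used in this report: one row per year, first maximum wins
first_only <- toy %>%
  group_by(year) %>%
  slice(which.max(total)) %>%
  ungroup()

# Alternative: keep every contestant who reaches the yearly maximum
with_ties <- toy %>%
  group_by(year) %>%
  slice_max(total, n = 1, with_ties = TRUE) %>%
  ungroup()

nrow(first_only) # one row per year
nrow(with_ties)  # one extra row for the tie in 2000
```

For a plot of the top score per year, either choice gives the same y-value; the difference only matters for which name appears in the tooltip.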
imo.results <- read.csv('imo_results.csv') #read data file
plot.USA <- imo.results %>%
  filter(country == "USA") %>% #filter data to USA participants
  group_by(year) %>%
  slice(which.max(total)) %>% #choose the highest scorer from the USA each year
  ggplot(aes(x = year,
             y = total,
             color = rank,
             text1 = firstname,
             text2 = lastname)) +
  geom_point(shape = 16) +
  scale_color_viridis() +
  xlab(NULL) +
  ylab("Total Point Score (0-42)") +
  ggtitle("Highest Scoring Americans at International Mathematical Olympiad",
          subtitle = "Competition Years 1984-2017") +
  scale_x_continuous(breaks = seq(1984, 2017, by = 6))
plot.GER <- imo.results %>%
  filter(country == "GER") %>% #filter data to GER participants
  group_by(year) %>%
  slice(which.max(total)) %>% #choose the highest scorer from GER each year
  ggplot(aes(x = year,
             y = total,
             color = rank,
             text1 = firstname,
             text2 = lastname)) +
  geom_point(shape = 16) +
  scale_color_viridis() +
  xlab(NULL) +
  ylab("Total Point Score (0-42)") +
  ggtitle("Highest Scoring Germans at International Mathematical Olympiad",
          subtitle = "Competition Years 1984-2017") +
  scale_x_continuous(breaks = seq(1984, 2017, by = 6))
plot.USA
plot.GER
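The two plot definitions above differ only in the country code and the title, so the shared pipeline could be factored into a function. The following is a minimal sketch of that idea, not part of the original analysis: the helper name `plot_top_scorers()` and its arguments `country_code` and `demonym` are hypothetical, and it assumes the same column names (`country`, `year`, `total`, `rank`, `firstname`, `lastname`) used above.

```r
library(dplyr)
library(ggplot2)
library(viridis)

# Hypothetical helper (an assumption for illustration, not in the original code):
# builds the static top-scorer plot for one country.
plot_top_scorers <- function(results, country_code, demonym) {
  results %>%
    filter(country == country_code) %>% # keep one country's participants
    group_by(year) %>%
    slice(which.max(total)) %>%         # highest scorer per year
    ungroup() %>%
    ggplot(aes(x = year,
               y = total,
               color = rank,
               text1 = firstname,
               text2 = lastname)) +
    geom_point(shape = 16) +
    scale_color_viridis() +
    xlab(NULL) +
    ylab("Total Point Score (0-42)") +
    ggtitle(paste("Highest Scoring", demonym,
                  "at International Mathematical Olympiad"),
            subtitle = "Competition Years 1984-2017") +
    scale_x_continuous(breaks = seq(1984, 2017, by = 6))
}

# Usage, assuming imo.results has been read in as above:
# plot.USA <- plot_top_scorers(imo.results, "USA", "Americans")
# plot.GER <- plot_top_scorers(imo.results, "GER", "Germans")
```

With a helper like this, adding a third country would take one extra line rather than another copy of the pipeline.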
Now I will use plotly to add interactive elements to these graphics and display the two plots side by side. Hovering over a point shows the contestant's first and last name, total score and rank.
dplot.USA <- ggplotly(plot.USA,
                      tooltip = c("text1", "text2", "y", "colour"))
dplot.GER <- ggplotly(plot.GER,
                      tooltip = c("text1", "text2", "y", "colour"))
subplot(dplot.GER, dplot.USA, shareY = TRUE) %>%
  layout(title = "High Scorers at IMO from Germany and the USA")
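The `htmlwidgets` package loaded at the top can also write an interactive figure to a standalone HTML page. The sketch below shows one way this might be done; the stand-in figure `fig`, its data points, and the output file name are assumptions made so the example runs on its own. In this report, the object returned by `subplot(...) %>% layout(...)` would be passed instead.

```r
library(plotly)
library(htmlwidgets)

# Stand-in figure with made-up points so the sketch is self-contained;
# replace with the combined subplot object from the report.
fig <- plot_ly(x = c(1984, 1990), y = c(30, 42),
               type = "scatter", mode = "markers")

# Illustrative output path (an assumption); written to a temporary directory here
out_file <- file.path(tempdir(), "imo_high_scorers.html")

# selfcontained = FALSE writes the JavaScript dependencies to a folder next to
# the HTML file, which avoids the pandoc requirement of a single-file export
saveWidget(fig, file = out_file, selfcontained = FALSE)
```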
Data Source: https://www.kaggle.com/luckyt/imo-scores